R21C: Revise distribution: add slurm-array; rev stats #296

Merged
merged 14 commits into from
Nov 7, 2024

Collaborator

@rtodling rtodling commented Nov 1, 2024

This branch brings in the revised ensemble parallelization options implemented in develop (aimed at FP).

The only two procedures for which I am enabling slurm-arrays are atmens_recenter and post_egcm. I have made sure that the old-style parallelization works again for other tasks, such as the observer and gcm calls; I have reset the DST variables in the AtmEnsConfig files to parallelize the ensemble as in the past.
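
As an illustration of how a Slurm job array distributes ensemble members, the sketch below maps an array task index to a member directory. It is only a minimal sketch in Python (the scripts in this PR are csh): the function name `member_dir_for_task`, the `ens_root` argument, and the 1-based, three-digit `memNNN` layout are assumptions made for the example. `SLURM_ARRAY_TASK_ID` is the environment variable Slurm sets inside each array task.

```python
import os


def member_dir_for_task(ens_root, task_id=None):
    """Return the member directory a Slurm array task would work on.

    Hypothetical helper: the memNNN naming and 1-based indexing are
    assumptions for illustration, not taken from the actual scripts.
    """
    if task_id is None:
        # Slurm exports this variable inside every task of a job array.
        task_id = int(os.environ["SLURM_ARRAY_TASK_ID"])
    if task_id < 1:
        raise ValueError("member indices are assumed to start at 1")
    return os.path.join(ens_root, f"mem{task_id:03d}")
```

With this layout, each array task handles exactly one member, so the scheduler rather than the script decides how many members run concurrently.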

In doing so, I also stumbled on the changes that had been made in atmens_stats.csh; I believe there was a bit more hardwiring in the changes than need be, so I revised things accordingly. Particularly noticeable are:

  1. I see no need for if statements to create the time stamp of the files to be handled (i.e., the timetagz variables can be built automatically).

  2. Also, I see no need to have multiple mp_statsXX.rc files controlling different streams, at least not for the reason these extra files were being used for - they simply had a different number of levels. It turns out it is easy to get the number of levels on the fly and to edit the standard mp_stats.rc on the fly as well.

  3. A third thing, which has been there for a while, is that hidden files associated w/ the successful termination of specific execs were being created in children directories rather than the parent directory. I revised this in develop to follow the overall strategy in the ensemble design of placing the hidden files in the parent directory.

  4. The other thing I noticed in the stats is that mp_statsXX.rc is using 48 PEs to run the stats ... I think that's way too many PEs and causes the jobs to sit too long in the queue. I reduced this to what the x-exp setting is, namely, 4 PEs.

  5. Now, a change (addition) has been placed in atmens_stats in the R21C branch that calculates the mean and variance of fields in the ensdiag. That's good - this is something that had been missing - but there are a few issues w/ this:

5a) When the data in ensdiag is handled, hidden files get placed in the child directory (ensdiag) - I will revise this so that all hidden files are placed in the parent dir.
5b) I noticed that mean and variance are not calculated for all files in the ensdiag/memMMM. The reason for this is that post_egcm is somewhat hardwired to handle files in the background period of the model integration. This is easy to unwire (I am working on it), allowing stats to be calculated for all output in ensdiag.
5c) HOWEVER: it should be noted that the present spin-up period (and a little of the actual streams) is calculating stats for diagnostic files from the GCM during the predictor period - which is not the GMAO convention, which is to produce products within the corrector phase of the integration. I can address this when tackling this item here (but, @elakkraoui, we need to talk and reach an agreement on this).

  6. Also fixed are a couple of oversights associated with the ensemble diagnostics:
    6a) the aerosol diagnostic is an instantaneous file stream but is named as tavg in the post_egcm file controlling output, so no mean and variance files were being calculated for this stream.
    6b) the stream int_tavg_6hr_glo_L288x181_slv was coming out at an incorrect time because the stream in history was missing the entry for ref_time.
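
Item 2 above (one standard mp_stats.rc edited on the fly instead of several per-stream copies) can be sketched as a simple substitution into the resource text. This is only an illustrative Python sketch: the real scripts are csh, and the key name `levels`, the `key: value` line layout, and the function `set_levels` are assumptions for the example rather than the actual contents of mp_stats.rc. How the number of levels is read from a file is left out; it is passed in as `nlevs`.

```python
import re


def set_levels(rc_text, nlevs, key="levels"):
    """Return rc_text with the value on the `key:` line replaced by nlevs.

    The key name and the "key: value" layout are hypothetical.
    """
    # Match only spaces/tabs around the key so the pattern stays on one line.
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*:[ \t]*).*$", re.MULTILINE
    )
    # A function replacement avoids misreading digits as group references.
    new_text, count = pattern.subn(lambda m: f"{m.group(1)}{nlevs}", rc_text)
    if count == 0:
        raise KeyError(f"'{key}:' not found in resource text")
    return new_text
```

The design point is that a single template plus one run-time value replaces a family of nearly identical files that differ only in that value.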

I have addressed the issues in (5) above; this required changing the API of post_egcm and changes to atm_ens.j - all minor, and all related to diagnostics. Now mean and variance are calculated for all diagnostics desired; notice that I introduced a post_egcm_diag.rc file that controls the diag statistics - this was done to avoid internal (unnecessary) logic inside post_egcm.
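
For reference, the mean and variance over ensemble members amount to a point-by-point reduction across members. The Python sketch below is illustrative only: it is not the code used by atmens_stats or post_egcm, the function name `ensemble_mean_variance` is made up, and the unbiased (n - 1) normalization of the variance is an assumption, since the thread does not say which normalization is used.

```python
def ensemble_mean_variance(members):
    """Point-wise mean and sample variance across ensemble members.

    members: list of equally sized lists of floats, one list per member.
    The (n - 1) normalization is an assumption made for this sketch.
    """
    n = len(members)
    if n < 2:
        raise ValueError("need at least two members for a sample variance")
    npts = len(members[0])
    if any(len(m) != npts for m in members):
        raise ValueError("all members must have the same number of points")
    mean = [sum(m[i] for m in members) / n for i in range(npts)]
    var = [
        sum((m[i] - mean[i]) ** 2 for m in members) / (n - 1)
        for i in range(npts)
    ]
    return mean, var
```

For two members `[1.0, 2.0]` and `[3.0, 6.0]`, this gives a mean of `[2.0, 4.0]` and a variance of `[2.0, 8.0]`.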

github-actions bot commented Nov 1, 2024

Label error. Requires at least 1 of: 0 diff, 0 diff trivial, Non 0-diff, 0 diff structural, 0-diff trivial, Not 0-diff, 0-diff, automatic, 0-diff uncoupled. Found:

mathomp4
mathomp4 previously approved these changes Nov 3, 2024
@rtodling rtodling added the enhancement, 0 diff, and bug fix labels Nov 4, 2024
@rtodling rtodling changed the title from "Revise distribution: add slurm-array; rev stats" to "R21C: Revise distribution: add slurm-array; rev stats" Nov 5, 2024
mathomp4
mathomp4 previously approved these changes Nov 5, 2024
Collaborator Author

rtodling commented Nov 5, 2024

Just for the record, I will state here that after conversation with @elakkraoui, @rlucches, and @sdrabenh it was agreed that the ensemble diagnostic will be written in the PREDICTOR part of the 12-hour IAU integration - just as it is being written in the spin-up period (with a fix to produce the aerosol output).

Since only fields in the predictor part are desirable, I am cutting down the output size and having history write out only the derivable fields during the predictor part, as opposed to what is happening in the spin-up period, where a snapshot is also produced during the corrector part. These are not needed.

Collaborator Author

rtodling commented Nov 7, 2024

@mathomp4 Hi Matt, can you help me get this in? - going into R21C projected branch. Thank you.

@mathomp4 mathomp4 marked this pull request as ready for review November 7, 2024 13:50
@mathomp4 mathomp4 requested review from a team as code owners November 7, 2024 13:50
Member

mathomp4 commented Nov 7, 2024

@rtodling I approved it and undrafted it. Feel free to merge at your pleasure.

@rtodling rtodling merged commit 2607c0d into R21C Nov 7, 2024
3 of 4 checks passed
Labels
0 diff (the changes in this pull request have been verified to be zero-diff with the target branch), bug fix, enhancement (new feature or request)
2 participants